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SUMMARY We propose a novel network traffic matrix decomposi- 
tion metliod named Stable Principal Component Pursuit with Frequency- 
Domain Regularization (SPCP-FDR), which improves the Stable Princi- 
pal Component Pursuit (SPCP) method by using a frequency-domain noise 
regularization function. An experiment demonstrates the feasibility of this 
new decomposition method. 

key words: Traffic Matrix, Stable Principal Component Pursuit, 
Frequency-Domain Regularization, Iterative Algorithm. 

1. Introduction 

The network traffic matrix has been appHed to many signifi- 
cant application problems such as capacity planning, traf- 
fic engineering and anomaly detection. A traffic matrix 
combines diverse traffic components with distinct temporal 
properties. Therefore, it is necessary to decompose them ef- 
ficiently, and this problem is named traffic matrix structural 
analysis [1]. We presented the traffic matrix decomposi- 
tion model in [2] and decomposed a traffic matrix into three 
sub-matrices, which is equivalent to the generalized Robust 
Principal Component Analysis (RPCA) problem [3]. The 
results in [2] were achieved by applying the Stable Princi- 
pal Component Pursuit (SPCP) method in [3]. 

In this study, we improve the traffic matrix decomposi- 
tion method by using frequency-domain regularization. This 
method is a variation of SPCP, and is named Stable Prin- 
cipal Component Pursuit with Frequency-Domain Regular- 
ization (SPCP-FDR). We design the numerical algorithm for 
SPCP-FDR, evaluate its decomposition results on the Abi- 
lene dataset [7], and show that SPCP-FDR achieves more 
rational traffic decompositions compared with SPCP. 

2. Background 

The standard RPCA problem is formally defined in [4] : 
Problem 1 (Standard RPCA) Suppose that a known matrix 
X 6 R'"^" is of the form A -\- E, where A and E are un- 
known matrices. It is assumed that A has a low rank and E 
is sparse. The problem is to recover A and E. 
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In many applications, data matrices satisfying the as- 
sumptions in Problem 1 are polluted by dense noise with 
small magnitude. This leads to the generalized RPCA [3]: 
Problem 2 (Generalized RPCA) Suppose that a known ma- 
trix X € W"^" is of the form A + E -\- N, where A, E and N 
are unknown matrices. It is assumed that A has a low-rank, 
E is sparse, and N is an i.i.d. entry-wide noise matrix with 
small magnitude. The problem is to recover A, E and N. 

We refer RPCA as the generalized version as follows. 
Zhou et al. [3] proved that under surprising board condi- 
tions, for "almost all" data matrix X which is the sum of a 
low-rank matrix and a sparse matrix, and is corrupted by a 
dense noise matrix A^ whose Frobenius norm \\N\\f < S {6 is 
a positive constant), one could stably estimate A and E with 
high probability using this convex program: 



rnin||A||.+^||£||i 

A,E 



s.t.\\X - A -\- E\\f < 6, 



(1) 



where A is a positive parameter, || ■ ||, and || ■ ||i denote the 
matrix nuclear norm and the /i norm, respectively. They 
named this method as Stable Principal Component Pursuit 
(SPCP), which needs to solve a time-consuming constrained 
program. Thus they turned to solve another similar uncon- 
strained program instead (r is another positive parameter): 



minrdlAlU 

A,E 



+ A\\E\U)+^\\X- 



E\\l. 



(2) 



Suppose X e 



pTxP 



is a traffic matrix, and each col- 



umn Xj 6 



^ (1 < y < P) is the traffic time-series of an 
Original-Destination (OD) flow. Following the traffic ma- 
trix decomposition model [2], X is the sum of three sub- 
matrices: (1) The deterministic traffic matrix A is a low- 
rank matrix, contributed by the periodic traffic changes in 
each OD flow; (2) The anomaly traffic matrix £ is a sparse 
matrix, but its nonzero entries may have large magnitudes; 
(3) The noise matrix A^ is constituted of independent ran- 
dom variables with relatively small magnitudes, the entries 
in one column constitute a white noise vector, but those in 
different columns have distinct variances. In [2], we esti- 
mated the noise traffic variances {cr,)^ of all the OD flows 
in X, and divided Xj by ctj (l < j < P). This preprocessing 
normalizes random variables in A^, and preserves the rank of 
A, as well as the sparsity of E. Thus traffic matrix decom- 
position is equivalent to RPCA, and we only consider traffic 
matrices with unit noise traffic variance. 
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3. Stable Principal Component Pursuit with Frequency- 
Domain Regularization 

Suppose "®" is a traffic matrix decomposition method, and 
A^® 6 E"^^^ is the decomposed noise traffic matrix of X by 
®. In this letter, SPCP and SPCP-FDR are denoted as "O" 
and "©", respectively. We consider the frequency-domain 
property of the decomposed noise traffic matrices. For each 
noise traffic time-series A^® (the j-th column of A^®), aj e 
C^ denotes its Discrete Fourier Transform (DFT): 



•'J J 



(3) 



where W e C''^^''^ is the discrete Fourier basis matrix. The 
f-th column vector W, € C^ (1 < f < T) is defined as; 






Ht~\)(k^\) 



k= i,2,...,r. 



A^® 's spectral density ipj € MJ is defined as: 



(pjit)^\ajit)\- f=l,2,...,r. 



(4) 



(5) 



AsA^® 



t + 2) for each po- 



. is a real signal, fjit) - (pj{T 
sition t e [2, T] n N, and (T - t + 2) is named the dual 
position of t. The spectra (pj(t) at positions close to 2 or T 
describe A^®'s power distributed in low-frequency domain; 
conversely, the spectra at positions close to (^ 4- 1) indicate 
the high-frequency power Furthermore, the spectral density 
<I>^e e M^ of A^® is defined as the entry-wise sum of all the 
noise traffic time-series' spectral density: 



'^MO ^ Yu'^jit) f=i,2,...,r. 



(6) 



j=i 



We compute the spectral density of eight noise traffic 
matrices NOl® ~ N08® and display them in Fig. 1, which 
are independently decomposed from the eight Abilene [7] 
weekly traffic matrices XOl ~ X08 by SPCP we adopted in 
[2]. In this dataset, P - 121, T = 2016, and the minimal 
time interval is 5 minutes. Thus for each noise traffic ma- 
trix NOx® (x e {1,2, ...,8)), <[)n()xo(0 represents the power 
captured by periodic traffic with period n^^(,_iVm6-,+2) i-) 
hour(s). We discover these two properties for the noise traf- 
fic matrices decomposed by SPCP: (1) The low-frequency 
spectra are generally much larger than the high-frequency 
spectra. This means that most energy of these noise traf- 
fic matrices is contributed by the low-frequency traffic pat- 
terns; (2) Quite a few low-frequency spectra have dramati- 
cally larger values than their neighbors (No. 8, 15, 29, 57, 
113 and 169 spectra, as well as the spectra at dual positions, 
and their corresponding periods are 24, 12, 6, 3, 1.5 and 1 
hour(s), respectively). 

These properties match our assumptions of the noise 
traffic poorly. As each column of the noise traffic matrix is 
assumed as a white noise vector, if the noise traffic is exactly 




Fig. 1 The spectral density of N01° ~ N08° (left); For each noise traffic 
matrix, the first 200 positions of its spectral density are specially magnified 
(right), which describe the low-frequency power 

recovered by SPCP, the spectral density should show a flat 
distribution. We briefly explain the cause of this limitation. 
In most cases, the volume of a traffic matrix is mainly con- 
tributed by the deterministic traffic. The deterministic traffic 
changes slowly and shows typical diurnal pattern. There- 
fore, the spectral density of each OD flow presents unbal- 
anced distribution: The low-frequency spectra usually have 
much larger values, and the spectra whose periods are the 
factors of 24 hours show strong peaks. The optimization 
problem (2) can be written in this equivalent form: 



min||A||.+i||£||i+rl|A^||^ 

A,E,N 



s.tA +E+N^X, 



(7) 



where y = 1/2t, and the objective function for A^ is 
the Frobenius norm. Thus problem (7) has no frequency- 
domain consideration on the noise traffic. With high proba- 
bility, the noise traffic decomposed by SPCP inherits the un- 
balanced spectral density distribution of the OD flow traffic. 
These discussions motivate us to replace ||A^||^ by a new 
objective function which focuses on A^'s frequency-domain 
property, and the new decomposed noise traffic should have 
more flat spectral density. We define the frequency-domain 
weight matrix C = diag(ci, ..., cj-) satisfying: 



Q>0, f = 1,2, ...,r; J]cj ^ 



(8) 



i=\ 



We propose the new traffic decomposition method called 
Stable Principal Component Pursuit with Frequency- 
Domain Regularization (SPCP-FDR). Compared with 
SPCP, SPCP-FDR adopts the same objective functions for 
the deterministic and the anomaly traffic, respectively; while 
IICW^A^IIf is chosen as the new objective function for the 
noise traffic (C's design is presented in Section 4): 

min||A|U+i||£||i+r||CW^A^||l s.t.A + E + N^X, (9) 

A.E.N 

where A and y are positive parameters balancing the three 
objective functions. We choose A - Ij -\/max{T, P) because 
it is demonstrated as a proper choice in [4]. For the choice 
of y, consider the simplest case: each column in A^ is a 
standard Gaussian white noise. The DFT of one Gaussian 
white noise is also a Gaussian white noise, and it can be 
proved that E [||CH^t^||^] = E [||A^||^] (E denotes expecta- 
tion). Therefore, SPCP-FDR can be seen as an approxima- 
tion of SPCP under the Gaussian white noise assumption. In 
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[2], we choose t = -y/2 log(rP) max{T, P) for problem (2). 
Thus in this study y = 1/2t = 1/ (2 ^JT\o^fpjmzx{J\Pfj. 



A,E.N ,,,,,/ 



4. Implementation Details 

4. 1 The Design of Frequency-Domain Weight Matrix 

We design C for decomposing the Abilene weekly traffic 
matrices (T - 2016), and we believe that the key technolo- 
gies are adaptable to other datasets with small modifications. 
Instinctively, it punishes the low-frequency spectra, espe- 
cially for the spectra whose corresponding periods are the 
factors of 24 hours. Define the position set 5 1 as: 

S'l = {8,15,29,57,113,169); 5* = lt\{T-t + 2)eS'l} 
5 J represents the dual positions of S". We design {c,),^[ as: 
i/3[v(t)+p] teSi 



Ct 



fivif) 



otherwise 



(10) 



where v{x) is a positive function in [\,T] satisfying v{x) - 
v{T -x+2) in [2, T]. Meanwhile, it decreases monotonically 
in [1, ^ -I- 1]. p > is an additional penal parameter for the 
spectra correlated to 5 1 . yS > is a scaling parameter In this 
study, we choose 



Ug-Too- -Hi xe[\,^ + \) 

v{x) = { (r-^+i) T- - 

l4e-^55- -Hi xe[^ + l,T] 



(11) 



and p = 2. At last, we get /3 = 0.4832 from assumption (8) 
for this special choice of v{x) and p. 



= min-^||A-G^||^+/z||A||.+ 

A / 

min^||£-G^||2+^^||£||,+ 

E I 

min -^||A^ - G'^lli -HjuyllCW^A^Ill + constant, 
N 2 



(14) 



where 



G° = y° (y^ + y'^ + y^-X), De{A,E,N]. (15) 

Problem (14) splits into three independent optimization 
problems, and the first two (for A and E) are well studied 
[5]. We give the solution to the third problem by derivation: 



N^Lf [LjItxt + 2yLjWC-W^\ ' G^ 



(16) 



Following the main idea in [5], we present the APG 
algorithm for the SPCP-FDR method in Algorithm 1 . This 
algorithm has the (9(1 /fc~) convergence rate. Because the 
proof is very close to Theorem 2.1 in [5] and Theorem 4.4 
in [6], we omit it and directly summarize this result: 
Theorem 1 Let F{A, E, N) = Jlg(A, E, N)+f{A, E, N). Then 
for all k>ko = [log [f) I log (i)], we have 

F{Xk) - F(X*) < 6\\Xk, - X*\\ll{k -ko+ l)\ (17) 

where X^ = (A^, E^, N^) is defined in Algorithm 1, and X* = 
(A*, E*,N*) is a solution to (12) when fi -Ji. Notice that in 
this study L/ = 3, while in [5] it equals to 2. 



4.2 The APG Algorithm for SPCP-FDR 

The Accelerated Proximal Gradient (APG) algorithm for 
SPCP-FDR solves a relaxed approximation problem of (9): 

min FiA, E, N) = min ufl(A, E, N) + /(A, E, N) 

A,E.N AM,N^^ J V ' ' y 

g{A,E,N)^\\A\l+A\\Eh+y\\CW''N\\l, (12) 

f{A,E,N)^]^\\A + E + N-X\\l, 

where yu > is a parameter As yu — > 0, the solu- 
tion to (12) approaches to the solution to (9). Not di- 
rectly minimizing F{A, E, N), this algorithm minimizes a se- 
quence of quadratic approximations Q{A, E,N, Y'^, Y^, 7") 
to F(A, E, N) at point (y*, Y^, Y^) (renewed in each step): 

QiA,E,N,Y'^,Y'^,Y'^) 



=fig{A,E,N) + fiY'^, Y'^, Y^)+ 

/A vE vN 



{VfiY'', Y", y™), (A, E,N) - (y^, Y", y™))+ 



(13) 



L 



'/, 



-^\\{A,E,N)-{Y\Y\Y^)r^, 



Algorithm 1 The APG Algorithm for SPCP-FDR 

Input: traffic matrix X e MJ^'' with unit noise variance. 

Initialization: Aq = A_i = Eq = £_! = No = A'-i = 0^^''; to = f-i = 1; 

lio = O.99IIXII2; j; = 10- Vo;<: = 0. 

Tl = 0.9; Lf = 3; A = 1/ ^Jmax{T,P);y = 1 / (2 ^2 logiTP) max(T, P)). 

While not converged do 

^t='^i' + ^Wa. - At-&, Yf = Et + '-^{Ek - Ek-i); 

Yjf = Nt + '-^{Nk - Nt^i); 

<^° = ^° -TJ^^t + ^k +Y^-X),ne lA, E, N]; 

(U,S, V) = svd Gi ; //svd[-] denotes singular value decomposition. 

Ak+i = US,, [S]V'^;Ek+i =Sm [cf]; 

//5e[-] denotes soft-thresholding operator with threshold e > 0. 

Nk+i = Lf [L//rxr + iMkyWC^W^l' G^ 






where the Lipschitz constant Lj -1>. It can be derived that: 



tk+i = (1 + ^4r2 + l)/2; Hk+\ = max(r;//t,;u); k = k+\. 
End wliile 
Output: A=Ak\E = Ek\N = Ek. 

5. Experiment Results 

We evaluate the SPCP-FDR method by using the Abilene 
traffic matrices XOl ~ X08. For each x 6 {1,...,8), 
suppose XOx is decomposed as {A0x®,E0x®, NOx®) and 
{AOx®, EOx®, NOx'3) by SPCP-FDR and SPCP respectively. 
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Fig. 2 The spectral density of NOl® ~ N()8® decomposed by SPCP- 
FDR, compared with the spectral density of N01° ~ N08° decomposed by 
SPCP (left); For each matrix, the first 200 positions of its spectral density 
are specially magnified (light), which describe the low-frequency power 

Table 1 A comparison between SPCP and SPCP-FDR on the decom- 
posed deterministic traffic matrices. 



Traffic matrix 


rank(XOx) 


rank(AOx°) 


rank(AOx®) 


||AOx» p 
IIAOxH f 


XOl (X = 1) 


121 


10 


19 


1.0158 


X02 (X = 2) 


121 


11 


22 


1.0171 


X03 (x = 3) 


121 


12 


20 


1.0283 


X04 (x = 4) 


121 


11 


18 


1.0145 


X05 (x = 5) 


121 


10 


19 


1.0148 


X06 (x = 6) 


121 


10 


22 


1.0162 


X07 (x = 7) 


121 


13 


24 


1.0210 


X08 (x = 8) 


121 


12 


23 


1.0182 



Figure 2 compares the spectral density of the noise traf- 
fic matrices NOl® ~ N08® (blue) and NOl® ~ N08® (black). 
For each x e {1, ..., 8), NOx® has more flat spectral density 
distribution than NOx®. As we further regularize noise traf- 
fic's spectra whose positions lie in 5 1, their magnitudes dra- 
matically decline. Therefore, for the new decompositions 
of XOl ~ X08 by SPCP-FDR, most low-frequency traf- 
fic pattern is efficiently eliminated from the resulting noise 
traffic matrices NOl® ~ N08®. Table 1 compares the deter- 
ministic traffic matrices AOl® ~ A08® and AOP ~ A08®. 
The ranks of AOl® ~ A08® decomposed by SPCP-FDR are 
nearly twice as those decomposed by SPCP. However, as all 
these values do not exceed 24 and the original traffic matri- 
ces' ranks are 121, AOl® ~ A08® still satisfy the low-rank 
characteristic. For each x e {1,...,8), the Frobenius norm 
of AOx® is a little larger than that of AOx®, and it can be 
explained that most low-frequency traffic pattern eliminated 
from the noise traffic is added to the deterministic traffic. 




Fig. 3 A comparison between SPCP and SPCP-FDR on the absolute 
Pearson coiTelation coefficients (between the deterministic traffic time- 
series and the noise traffic time-series of each OD flow) of XOl . 

As a case study, we analyze ffie decomposition result of 
XOl in detail. For each OD flow (XOl); (1 < j< 121), com- 
pute the absolute Pearson correlation coefficient between 
(A01®)y and (N01®)y, and ffie coefficient between (AOl®); 
and (N01®)j. Figure 3 displays these coefficients arranged 
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Fig. 4 A comparison between SPCP and SPCP-FDR on the decomposi- 
tion result of OD flow (X01)i2i. Upper panel: the deterministic and the 
noise traffic time-series decomposed by SPCP; Bottom panel: the deter- 
ministic and the noise traffic time-series decomposed by SPCP-FDR. 

by flow ID. This is a proper metric of the cross-correlation 
and it should be as small as possible. For most OD flows 
in XOl, compared with SPCP SPCP-FDR's absolute Pear- 
son correlation coefficients show significant decline. Actu- 
ally, X02 ~ X08 show quite similar results. Thus SPCP- 
FDR could efficiently reduce this cross-correlation. Figure 
4 compares SPCP-FDR with SPCP on decomposition result 
of the No. 121 OD flow, which produces the largest volume 
in XOl. Instinctively, ffie noise traffic time-series (N01®)i2i 
contains a distinct diurnal trend, which should be classified 
as the deterministic traffic better. By contrast, (N01®)i2i 
presents a more stable temporal pattern. Thus SPCP-FDR 
gives a more rational decomposition for this flow. 

6. Conclusions 

This letter presents a novel traffic matrix decomposition 
method named SPCP-FDR, which improves SPCP by us- 
ing frequency-domain regularization. We propose an APG 
algorithm for SPCP-FDR, its convergence rate is 0(1 /k~), 
and demonstrate it on a real-world dataset. SPCP-FDR ef- 
ficiently eliminates the low-frequency periodic traffic from 
the noise traffic, maintains deterministic traffic's low-rank 
property, and shows lower cross-correlation between these 
two kinds of traffic than SPCP Therefore, SPCP-FDR 
achieves more rational traffic decompositions. 
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